Introduction

Videogames are one of my favorite pastimes. As a player who participates in ranked play, I’ve always kept my eye on the forefront of global competition. One of the most notable games to establish a global competitive scene backed by paid professionals was the real-time strategy (RTS) game StarCraft II (SC2). In 2013 the top 10 StarCraft players by earnings made nearly four million dollars in combined winnings (“Winnings: 2019 - Liquipedia - the Starcraft Ii Encyclopedia” n.d.). The reflexes and control of top-caliber players astonish even seasoned videogame enthusiasts. At the 2019 StarCraft II World Championship Finale, I and many others packed into the arena to see firsthand what these professionals could do.


(“Congrats to the Starcraft Ii Wcs Global Finals Champion! - Blizzcon” n.d.)

The eye-watering speed at which they perform is typically measured in gaming terminology as actions per minute (APM). Professionals act so quickly (high APM) that it becomes challenging to follow their overall strategy. So beyond marveling at their sheer speed, I found it difficult to define precisely what made these players professionals.

To learn more about what defines talent in SC2, this analysis explores in-game metrics in an attempt to predict a player’s rank in competitive mode. The dataset used was provided by (“UCI Machine Learning Repository: SkillCraft1 Master Table Dataset Data Set” n.d.).

Goal

To model the response LeagueIndex, we will explore a sample of player data from a 2013 ranked season of StarCraft II. The predictors summarize in-game performance metrics over a season for each player (GameID). The modeling process will start with all of the predictor variables and then trim them down until only significant predictors remain. Variables will be vetted for multicollinearity, and finally the model will be checked to see whether the BLUE (best linear unbiased estimator) assumptions hold.

Limitations of the Model

The multiple linear regression model used for the midterm 2 portion of this study explores the linear estimation of the mean response of LeagueIndex from the predictors in the design matrix \(X\). The validity of this model’s approximations assumes that LeagueIndex is Gaussian distributed and can vary continuously, but LeagueIndex is an ordinal variable. Its levels range from 1 to 8, corresponding to the player ranks Bronze, Silver, Gold, Platinum, Diamond, Master, Grand Master, and Professional. Game ranking systems are frequently based on an Elo or matchmaking rating (MMR) that varies over a much larger range, typically ~1200-3000. These ratings are then masked with the medal ranks listed above and further subdivided into divisions within each medal (“Leagues: 2019 - Liquipedia - the Starcraft Ii Encyclopedia” n.d.). Using either MMR or each player’s division would provide some much-needed continuity, but unfortunately neither metric is available. These limitations will be revisited more specifically during the exploration, modeling, and prediction steps.

A more suitable model for this problem would be a polytomous logistic regression for an ordinal response (the proportional odds model) (“Ordinal Logistic Regression | R Data Analysis Examples” n.d.). These methods will be revisited in the final portion of this analysis.
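As a rough illustration of the proportional odds model mentioned above, the sketch below fits one with `polr()` from the MASS package, which models \(\operatorname{logit} P(Y \le j) = \theta_j - x^\top\beta\). The data are simulated and the names `toy`, `league`, and `apm_z` are hypothetical; this is not the model fitted in this report.

```r
library(MASS)  # provides polr(), a proportional odds fitter

set.seed(1)
n <- 500
# Hypothetical standardised predictor (simulated, not the SkillCraft data)
apm_z <- rnorm(n)
# Latent skill with logistic noise, cut into 7 ordered levels
latent <- 1.3 * apm_z + rlogis(n)
league <- cut(latent,
              breaks = quantile(latent, probs = seq(0, 1, length.out = 8)),
              include.lowest = TRUE, labels = FALSE)
toy <- data.frame(league = factor(league, ordered = TRUE), apm_z = apm_z)

# Hess = TRUE keeps the Hessian so summary() can report standard errors
fit <- polr(league ~ apm_z, data = toy, Hess = TRUE, method = "logistic")
coef(fit)   # slope on the log-odds scale
fit$zeta    # six cut points separating the seven levels
```

Unlike the linear model, this treats the levels as ordered categories, so no assumption of equal spacing between leagues is needed.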

Exploring the Data

This dataset is a sample of averaged in-game metrics for StarCraft II players who participated in 2013 ranked play. The variables are as follows:

##  [1] "GameID"               "LeagueIndex"          "Age"                 
##  [4] "HoursPerWeek"         "TotalHours"           "APM"                 
##  [7] "SelectByHotkeys"      "AssignToHotkeys"      "UniqueHotkeys"       
## [10] "MinimapAttacks"       "MinimapRightClicks"   "NumberOfPACs"        
## [13] "GapBetweenPACs"       "ActionLatency"        "ActionsInPAC"        
## [16] "TotalMapExplored"     "WorkersMade"          "UniqueUnitsMade"     
## [19] "ComplexUnitsMade"     "ComplexAbilitiesUsed"

The appendix covers each variable in depth, but the following are highlighted because they are used in the final analysis.

LeagueIndex was covered thoroughly in the section above.


Perception-action cycles (PACs) are the circular flow of information between an organism and its environment, in which a sensor-guided sequence of behaviors is iteratively followed towards a goal (“Perception-Action Cycle - Models, Architectures, and Hardware | Vassilis Cutsuridis | Springer” n.d.). In this dataset PACs are aggregates of screen movements, where a PAC is a screen fixation containing at least one action (“UCI Machine Learning Repository: SkillCraft1 Master Table Dataset Data Set” n.d.).

Cleanliness

The missing values relate exclusively to players with LeagueIndex equal to 8 (Professional). For the 55 players with LeagueIndex = 8, Age is NA and HoursPerWeek is 0. LeagueIndex values 1-7 are obtainable through natural game play, whereas to be a professional you would have to be part of a team. I am aiming to understand how players go from average to good, less so from elite to best, so the observations with NA values associated with professionals will be dropped.

[Question for Colin]: Furthermore, while the Bronze through Master leagues (LeagueIndex 1-6) may contain any number of players, Grand Master (LeagueIndex = 7) may only contain 200 players. To analyze only the portions of the data that are “more normal”, the 55 players with LeagueIndex == 7 will be dropped.

The following output flags the columns that are problematic for the analysis.

## [1] "Age Values"
## missing 
##    "55"
## [1] "LeagueIndex Values"
## missing 
##     "0"
## [1] "Hours Per Week Lower Extreme Values"
## named numeric(0)

In addition to the missing values, there is a clear error in the total hours of one player. \(GameID = 5140\) has 1,000,000 total hours, which equates to 114 years of game time. If we assume one extra 0 was added, the player would still have 11.4 years of play time on a game that was only 10 years old as of 2020. Removing two 0s equates to 1.14 years of play time, which is much more realistic. Removing three 0s results in 41.7 days of play time, which seems just as realistic. There is no clear way to recover this player’s true total hours, so their data will be dropped from the analysis. This was originally detected during modeling, but was brought earlier into the analysis because of how obviously unintentional this value is.

## sc$TotalHours 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##     3338       57      237    0.998    960.4     1159    87.55   125.00 
##      .25      .50      .75      .90      .95 
##   300.00   500.00   800.00  1200.00  1566.00 
## 
## lowest :       3       7      10      12      16
## highest:   10260   18000   20000   25000 1000000
##                                   
## Value      0e+00 1e+04 2e+04 1e+06
## Frequency   3329     5     3     1
## Proportion 0.997 0.001 0.001 0.000
## 
## For the frequency table, variable is rounded to the nearest 10000
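The plausibility check on the 1,000,000-hour entry is simple arithmetic and can be sketched as follows. The variable names are illustrative, and a year is assumed to be 365 days of 24 hours.

```r
hours_per_year <- 24 * 365   # assumes a 365-day year
reported <- 1e6              # TotalHours reported for GameID 5140

# Candidate "true" values after removing 0, 1, 2, or 3 trailing zeros
candidates <- reported / 10^(0:3)

data.frame(
  hours = candidates,
  years = candidates / hours_per_year,
  days  = candidates / 24
)
```

None of the candidates is clearly the intended value, which is the reason the observation is dropped rather than corrected.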

Finally, a basic inspection of HoursPerWeek revealed a maximum value of 168. Considering there are 168 hours in a week, it is not plausible for an individual player to do this. Multiple players could be using this account, which would make it possible. Another possibility is that this player is actually an AI like DeepMind’s AlphaStar [https://deepmind.com/blog/article/alphastar-mastering-real-time-strategy-game-starcraft-ii]. Either way, this observation will be kept because a realistic cutoff for hours per week is not apparent, and after removing the observation the next maximum value is 140, which seems almost as unrealistic.

It’s worth noting that dropping any number of high-hour outliers is far from resolving all the issues that can arise with HoursPerWeek and/or TotalHours. Multiple players could be using any of the accounts even if the hours are not high. More likely still, many of these accounts could be smurf accounts. Smurfing is when a player makes an additional account to have their matchmaking rating reset [https://www.lol-smurfs.com/blog/what-is-a-smurf-account/]. A common reason for doing this is to dominate the competition until the rating adapts to the player’s skill level.

Converting Units

Some of the time-averaged metrics are per SC2 timestamp while others are per millisecond. To make these metrics more interpretable, each will be converted into seconds. There are roughly 88.5 timestamps per second, so each per-timestamp metric will be multiplied by that factor (“UCI Machine Learning Repository: SkillCraft1 Master Table Dataset Data Set” n.d.).

The metrics recorded in milliseconds will likewise be transformed into seconds so that the time units are completely uniform.

Both of these transformations are linear and will not affect our model’s assumptions.
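A minimal sketch of the two conversions is given below. The raw values are hypothetical and chosen only for illustration; the factor of 88.5 is the one quoted above, and the millisecond conversion assumes the raw column really is in milliseconds.

```r
timestamps_per_second <- 88.5   # figure quoted in the text above
ms_per_second <- 1000

# Hypothetical raw values, for illustration only
raw <- data.frame(
  AssignToHotkeys = c(0.0002, 0.0004),  # events per timestamp
  GapBetweenPACs  = c(30, 45)           # milliseconds
)

converted <- data.frame(
  AssignToHotkeys = raw$AssignToHotkeys * timestamps_per_second,  # events per second
  GapBetweenPACs  = raw$GapBetweenPACs / ms_per_second            # seconds
)
converted
```

Because each column is only multiplied or divided by a constant, the fitted coefficients rescale by the inverse constant while t-statistics and p-values are unchanged.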

Summary Statistics and Plots

summary(sc)
##      GameID      LeagueIndex        Age         HoursPerWeek   
##  Min.   :  52   Min.   :1.00   Min.   :16.00   Min.   :  2.00  
##  1st Qu.:2422   1st Qu.:3.00   1st Qu.:19.00   1st Qu.:  8.00  
##  Median :4788   Median :4.00   Median :21.00   Median : 12.00  
##  Mean   :4720   Mean   :4.12   Mean   :21.65   Mean   : 15.91  
##  3rd Qu.:6995   3rd Qu.:5.00   3rd Qu.:24.00   3rd Qu.: 20.00  
##  Max.   :9271   Max.   :7.00   Max.   :44.00   Max.   :168.00  
##    TotalHours           APM         SelectByHotkeys  AssignToHotkeys  
##  Min.   :    3.0   Min.   : 22.06   Min.   :0.0000   Min.   :0.00000  
##  1st Qu.:  300.0   1st Qu.: 79.22   1st Qu.:0.1103   1st Qu.:0.01785  
##  Median :  500.0   Median :107.07   Median :0.2164   Median :0.03086  
##  Mean   :  661.1   Mean   :114.53   Mean   :0.3556   Mean   :0.03222  
##  3rd Qu.:  800.0   3rd Qu.:140.11   3rd Qu.:0.4372   3rd Qu.:0.04361  
##  Max.   :25000.0   Max.   :389.83   Max.   :3.8133   Max.   :0.14587  
##  UniqueHotkeys   MinimapAttacks     MinimapRightClicks  NumberOfPACs    
##  Min.   :  0.0   Min.   :0.000000   Min.   :0.00000    Min.   :0.06009  
##  1st Qu.:265.5   1st Qu.:0.000000   1st Qu.:0.01228    1st Qu.:0.24277  
##  Median :354.0   Median :0.003420   Median :0.02464    Median :0.29881  
##  Mean   :382.0   Mean   :0.008303   Mean   :0.03365    Mean   :0.30384  
##  3rd Qu.:531.0   3rd Qu.:0.010053   3rd Qu.:0.04495    3rd Qu.:0.35426  
##  Max.   :885.0   Max.   :0.267212   Max.   :0.32636    Max.   :0.70540  
##  GapBetweenPACs   ActionLatency     ActionsInPAC    TotalMapExplored
##  Min.   :  6667   Min.   : 24633   Min.   : 2.039   Min.   : 5.00   
##  1st Qu.: 29328   1st Qu.: 50915   1st Qu.: 4.261   1st Qu.:17.00   
##  Median : 37059   Median : 61296   Median : 5.086   Median :22.00   
##  Mean   : 40714   Mean   : 64217   Mean   : 5.267   Mean   :22.12   
##  3rd Qu.: 48507   3rd Qu.: 74036   3rd Qu.: 6.027   3rd Qu.:27.00   
##  Max.   :237143   Max.   :176372   Max.   :18.558   Max.   :58.00   
##   WorkersMade       UniqueUnitsMade  ComplexUnitsMade   ComplexAbilitiesUsed
##  Min.   :0.006813   Min.   : 177.0   Min.   :0.000000   Min.   :0.000000    
##  1st Qu.:0.060331   1st Qu.: 442.5   1st Qu.:0.000000   1st Qu.:0.000000    
##  Median :0.079987   Median : 531.0   Median :0.000000   Median :0.001814    
##  Mean   :0.091213   Mean   : 579.0   Mean   :0.005312   Mean   :0.012569    
##  3rd Qu.:0.111342   3rd Qu.: 708.0   3rd Qu.:0.007737   3rd Qu.:0.016145    
##  Max.   :0.455713   Max.   :1150.5   Max.   :0.079851   Max.   :0.272907

Correlation Plot

Visually we can see that LeagueIndex has a relatively strong correlation with APM, SelectByHotkeys, AssignToHotkeys, NumberOfPACs, GapBetweenPACs, and ActionLatency. Some of these predictors may be the best choices for the model, although it’s worth noting at this point that many of the predictors also have fairly strong correlations among themselves, which may cause multicollinearity in a model. This is not too surprising because many of these metrics capture the rate of actions in slightly different forms. For example, APM and NumberOfPACs likely have a close mathematical relationship, approximately \(NumberOfPACs = APM \times MatchDurationMinutes\). The slight differences between these metrics could have some deep explanatory power, but that level of exploration is beyond the scope of this analysis.
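The kind of screening described above can be sketched with `cor()`. The data frame `toy` below is simulated purely for illustration (its columns only borrow names from the dataset), and Spearman rank correlation is used here as an assumption because the response is ordinal; the correlation plot in this report may have used a different method.

```r
set.seed(42)
n <- 200
apm <- rnorm(n, mean = 115, sd = 45)

# Simulated stand-in data: two action-rate metrics and one unrelated column
toy <- data.frame(
  LeagueIndex  = pmin(pmax(round(1 + apm / 40 + rnorm(n)), 1), 7),
  APM          = apm,
  NumberOfPACs = apm / 400 + rnorm(n, sd = 0.03),
  Age          = rnorm(n, mean = 21, sd = 4)
)

cors <- cor(toy, method = "spearman")

# Predictors ranked by absolute correlation with the response
league_cors <- sort(abs(cors["LeagueIndex", -1]), decreasing = TRUE)
league_cors

# Correlation between the two action-rate predictors themselves
cors["APM", "NumberOfPACs"]
```

A high value in the last line is the pattern that signals potential multicollinearity: both predictors track the response, but mostly because they track each other.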

The following columns will be dropped because of plausible confounding with APM:

- Remove ActionLatency
- Remove GapBetweenPACs (see footnote 1)
- Remove NumberOfPACs
- Remove SelectByHotkeys
- Remove ActionsInPAC

Focusing exclusively on APM fits an Occam’s razor approach by minimizing the dimensionality of the model space.

# Drop the ID column and the predictors judged redundant with APM
sc <- sc %>% select(!c(GameID, ActionLatency, GapBetweenPACs, NumberOfPACs, SelectByHotkeys, ActionsInPAC))

Visual Inspection

Because the response is ordinal, visually determining trends between the predictors and the response requires setting aside scatter plots for better-suited tools. Violin plots will be used to gauge linearity in relation to the response and the distribution of each variable at the varying levels [https://chartio.com/learn/charts/violin-plot-complete-guide/].

Age: the mean age of 21.65 does not vary much across LeagueIndex, so there is no stark linear relationship, although the variance at the highest level seems to be much narrower than at the lower levels.

HoursPerWeek


# Violin plot of Age at each league level, with mean +/- 1 SD overlaid
ggplot(sc, aes(x = factor(LeagueIndex), y = Age, fill = LeagueIndex)) +
  geom_violin(trim = FALSE, color = "blue") +
  stat_summary(fun.data = mean_sdl, fun.args = list(mult = 1),
               geom = "pointrange", color = "darkblue") +
  coord_flip()
## Warning: Removed 872 rows containing non-finite values (stat_ydensity).
## Warning: Removed 872 rows containing non-finite values (stat_summary).
## Warning: Removed 72 rows containing non-finite values (stat_ydensity).
## Warning: Removed 72 rows containing non-finite values (stat_summary).

Gaussianity of the Response

Applying the Shapiro-Wilk W test to the response LeagueIndex, we can reject the hypothesis that the response comes from a normally distributed population. Besides the obvious issues with performing a W test on an ordinal response, the response has a heavy tail with a mean of 4.12. Additionally, we cannot assume the levels of LeagueIndex are uniformly spaced.
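To illustrate why a W test on a seven-level response is expected to reject, the sketch below runs `shapiro.test()` on a simulated ordinal sample. The level probabilities are hypothetical and are not the observed league frequencies.

```r
set.seed(7)

# Hypothetical league proportions for levels 1-7 (illustration only)
probs <- c(0.05, 0.10, 0.17, 0.24, 0.24, 0.19, 0.01)
league_demo <- sample(1:7, size = 1000, replace = TRUE, prob = probs)

# shapiro.test() accepts between 3 and 5000 observations
sw <- shapiro.test(league_demo)
sw$statistic
sw$p.value
```

With so many tied values, the sample quantiles form a staircase against the normal quantiles, so a small p-value here says more about discreteness than about the shape of any underlying skill distribution.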

[Should I Include:] While the population distribution of rankings is difficult to come by, leagues like Grand Master (LeagueIndex = 7) are capped at 200 players, which forcibly prevents LeagueIndex 1-7 from having a Gaussian distribution.

Gaussianity of the Predictors

Taking a glance at the predictors with high correlation to LeagueIndex, we find that many have non-normal distributions, which could affect the Gaussianity of the resulting model.

Model Specifications

Model Assumptions at Risk

Manual Regression Iterations

I dropped Age as an insignificant predictor, which may mean it’s not too late for me!

5 Justify your choices using hypothesis testing and confidence intervals for selection of parameters. You should compare multiple model choices in describing this relationship, including the null model.

6 Evaluate the goodness of fit of the model in terms of R2 and the standard errors, and the major sources of uncertainty. This includes parameter uncertainty, as well as structural uncertainty in the model.

Before beginning diagnostic tests, a model with all predictors will be fit. Subsequently, predictors will be dropped one by one until only predictors significant at the \(\alpha = 5\%\) level remain. The results are as follows:

The full model summary:

## 
## Call:
## lm(formula = LeagueIndex ~ ., data = sc)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.3438 -0.6987  0.0463  0.7539  3.1312 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           1.461e+00  1.366e-01  10.692  < 2e-16 ***
## Age                  -2.291e-03  4.577e-03  -0.501  0.61669    
## HoursPerWeek          5.111e-03  1.644e-03   3.109  0.00189 ** 
## TotalHours            1.655e-04  2.284e-05   7.246 5.31e-13 ***
## APM                   1.232e-02  5.256e-04  23.448  < 2e-16 ***
## AssignToHotkeys       1.342e+01  1.232e+00  10.892  < 2e-16 ***
## UniqueHotkeys         4.731e-04  1.013e-04   4.671 3.12e-06 ***
## MinimapAttacks        1.120e+01  1.385e+00   8.083 8.77e-16 ***
## MinimapRightClicks   -7.756e-01  6.249e-01  -1.241  0.21468    
## TotalMapExplored      5.903e-03  3.135e-03   1.883  0.05977 .  
## WorkersMade           2.711e+00  4.408e-01   6.151 8.60e-10 ***
## UniqueUnitsMade       3.448e-05  1.434e-04   0.240  0.81002    
## ComplexUnitsMade      1.954e+00  2.508e+00   0.779  0.43592    
## ComplexAbilitiesUsed  1.439e+00  1.007e+00   1.429  0.15308    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.062 on 3322 degrees of freedom
## Multiple R-squared:  0.4641, Adjusted R-squared:  0.462 
## F-statistic: 221.3 on 13 and 3322 DF,  p-value: < 2.2e-16

After 4 iterations all of the remaining predictors were significant. The final iteration provided:

## 
## Call:
## lm(formula = LeagueIndex ~ HoursPerWeek + TotalHours + APM + 
##     AssignToHotkeys + UniqueHotkeys + MinimapAttacks + WorkersMade + 
##     ComplexAbilitiesUsed, data = sc)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.3986 -0.6914  0.0559  0.7595  3.0855 
## 
## Coefficients:
##                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)          1.507e+00  5.843e-02  25.795  < 2e-16 ***
## HoursPerWeek         5.276e-03  1.626e-03   3.245  0.00119 ** 
## TotalHours           1.656e-04  2.281e-05   7.260 4.79e-13 ***
## APM                  1.233e-02  5.052e-04  24.412  < 2e-16 ***
## AssignToHotkeys      1.354e+01  1.231e+00  10.997  < 2e-16 ***
## UniqueHotkeys        5.142e-04  9.884e-05   5.203 2.08e-07 ***
## MinimapAttacks       1.119e+01  1.355e+00   8.257  < 2e-16 ***
## WorkersMade          2.757e+00  4.331e-01   6.366 2.20e-10 ***
## ComplexAbilitiesUsed 2.341e+00  7.983e-01   2.932  0.00339 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.063 on 3327 degrees of freedom
## Multiple R-squared:  0.4627, Adjusted R-squared:  0.4614 
## F-statistic: 358.2 on 8 and 3327 DF,  p-value: < 2.2e-16
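The drop-one-at-a-time procedure described above can be sketched as a small loop. The helper `backward_eliminate()` and the simulated data are hypothetical; the models in this report were trimmed manually, and the sketch assumes all predictors are numeric so that coefficient names match column names.

```r
# Hypothetical helper: refit, drop the least significant predictor, repeat
backward_eliminate <- function(data, response, alpha = 0.05) {
  predictors <- setdiff(names(data), response)
  repeat {
    fit <- lm(reformulate(predictors, response = response), data = data)
    # p-values for every term except the intercept
    pvals <- summary(fit)$coefficients[-1, "Pr(>|t|)", drop = FALSE]
    worst <- which.max(pvals)
    # Stop once every remaining predictor is significant at alpha
    if (pvals[worst] <= alpha || length(predictors) == 1) return(fit)
    predictors <- setdiff(predictors, rownames(pvals)[worst])
  }
}

# Simulated example: x1 and x2 drive y, noise is unrelated
set.seed(1)
n <- 300
demo <- data.frame(x1 = rnorm(n), x2 = rnorm(n), noise = rnorm(n))
demo$y <- 2 * demo$x1 + demo$x2 + rnorm(n)

fit <- backward_eliminate(demo, response = "y")
names(coef(fit))
```

Selection by repeated significance testing tends to overstate the significance of whatever survives, so the p-values of a model chosen this way should be read with some caution.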

Testing for Significance of the Model

Our initial \(H_0\) is that there is no systematic structure to the response LeagueIndex. Our alternative \(H_a\) is that there is some relation such that \(LeagueIndex = X\beta + \epsilon\), where \(X\) contains all remaining variables with the exception of the index GameID. Unsurprisingly, using 13 predictor variables results in a very small p-value (< 2.2e-16). This does not change notably across the iterations, as the last model also results in a p-value below 2.2e-16. Thus for all iterations of the model we can reject the null hypothesis, suggesting that we should further investigate the explanatory power of our alternative hypothesis.
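The overall F-statistic for the full model can be recovered from its \(R^2\) using \(F = \frac{R^2 / p}{(1 - R^2)/(n - p - 1)}\). The sketch below plugs in the rounded values printed in the full model summary (\(R^2 = 0.4641\), 13 predictors, 3322 residual degrees of freedom); the variable names are illustrative.

```r
r2       <- 0.4641   # Multiple R-squared of the full model
p        <- 13       # number of predictors
df_resid <- 3322     # residual degrees of freedom

# Overall F-test of H_0: all slope coefficients are zero
f_stat  <- (r2 / p) / ((1 - r2) / df_resid)
p_value <- pf(f_stat, df1 = p, df2 = df_resid, lower.tail = FALSE)

f_stat
p_value
```

The result agrees with the reported F-statistic of 221.3 up to the rounding of \(R^2\).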

Normality of Residuals, Both Models

plot(sc_lm)

plot(sc_lm_final)

Testing for Significance Between Models

Our second test is to determine whether the starting and final models have significantly different residuals. Specifically, \(H_0\) is that \(RSS_{lm1} = RSS_{lmfinal}\) and \(H_a\) is that \(RSS_{lm1} \neq RSS_{lmfinal}\). Performing an ANOVA test, we find no significant difference between the models at \(\alpha = 5\%\) (\(p = 0.1396\)), so we cannot reject \(H_0\) (see the ANOVA table below). Additionally, their adjusted \(R^2\) values are barely different, starting at 0.4620 and ending at 0.4614.

## Analysis of Variance Table
## 
## Model 1: LeagueIndex ~ Age + HoursPerWeek + TotalHours + APM + AssignToHotkeys + 
##     UniqueHotkeys + MinimapAttacks + MinimapRightClicks + TotalMapExplored + 
##     WorkersMade + UniqueUnitsMade + ComplexUnitsMade + ComplexAbilitiesUsed
## Model 2: LeagueIndex ~ HoursPerWeek + TotalHours + APM + AssignToHotkeys + 
##     UniqueHotkeys + MinimapAttacks + WorkersMade + ComplexAbilitiesUsed
##   Res.Df    RSS Df Sum of Sq      F Pr(>F)
## 1   3322 3748.3                           
## 2   3327 3757.7 -5    -9.392 1.6648 0.1396
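The F-statistic in the ANOVA table can be reproduced by hand from the printed sums of squares, using \(F = \frac{(RSS_{reduced} - RSS_{full}) / \Delta df}{RSS_{full} / df_{full}}\). The values below are copied from the table; the variable names are illustrative.

```r
rss_full <- 3748.3   # residual sum of squares, 13-predictor model
df_full  <- 3322     # its residual degrees of freedom
ss_diff  <- 9.392    # increase in RSS after dropping 5 predictors
df_diff  <- 5

# Partial F-test for the 5 dropped predictors
f_stat  <- (ss_diff / df_diff) / (rss_full / df_full)
p_value <- pf(f_stat, df1 = df_diff, df2 = df_full, lower.tail = FALSE)

f_stat
p_value
```

Equivalently, the test asks whether the five dropped coefficients are jointly zero; a p-value above 0.05 means the smaller model fits about as well.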
confint(sc_lm_4)
##                             2.5 %       97.5 %
## (Intercept)          1.265054e+00 1.547625e+00
## HoursPerWeek         2.113622e-03 8.484853e-03
## TotalHours           1.198681e-04 2.092853e-04
## APM                  1.123781e-02 1.322460e-02
## AssignToHotkeys      1.111064e+01 1.593554e+01
## UniqueHotkeys        2.740538e-04 6.678158e-04
## MinimapAttacks       8.188274e+00 1.352614e+01
## TotalMapExplored     1.157379e-03 1.167418e-02
## WorkersMade          1.866932e+00 3.565515e+00
## ComplexAbilitiesUsed 3.212798e-01 3.523832e+00
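Each interval above is the estimate plus or minus a t quantile times its standard error. As a check, the sketch below rebuilds the APM interval from the rounded estimate (1.223e-02) and standard error (5.067e-04) printed in the `sc_lm_4` summary, with 3326 residual degrees of freedom; the variable names are illustrative.

```r
est <- 1.223e-02   # APM coefficient estimate
se  <- 5.067e-04   # its standard error
df  <- 3326        # residual degrees of freedom

# Two-sided 95% interval: estimate +/- t_{0.975, df} * SE
t_crit <- qt(0.975, df)
ci <- est + c(-1, 1) * t_crit * se
ci
```

The result matches the `confint()` row for APM to within the rounding of the inputs, and the interval excluding zero is the same statement as the coefficient being significant at the 5% level.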

Exploring the chosen model


summary(sc_lm_4, correlation = TRUE)
## 
## Call:
## lm(formula = LeagueIndex ~ HoursPerWeek + TotalHours + APM + 
##     AssignToHotkeys + UniqueHotkeys + MinimapAttacks + TotalMapExplored + 
##     WorkersMade + ComplexAbilitiesUsed, data = sc)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.3738 -0.7012  0.0460  0.7562  3.1244 
## 
## Coefficients:
##                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)          1.406e+00  7.206e-02  19.516  < 2e-16 ***
## HoursPerWeek         5.299e-03  1.625e-03   3.262  0.00112 ** 
## TotalHours           1.646e-04  2.280e-05   7.217 6.53e-13 ***
## APM                  1.223e-02  5.067e-04  24.141  < 2e-16 ***
## AssignToHotkeys      1.352e+01  1.230e+00  10.991  < 2e-16 ***
## UniqueHotkeys        4.709e-04  1.004e-04   4.690 2.84e-06 ***
## MinimapAttacks       1.086e+01  1.361e+00   7.976 2.06e-15 ***
## TotalMapExplored     6.416e-03  2.682e-03   2.392  0.01680 *  
## WorkersMade          2.716e+00  4.332e-01   6.271 4.06e-10 ***
## ComplexAbilitiesUsed 1.923e+00  8.167e-01   2.354  0.01863 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.062 on 3326 degrees of freedom
## Multiple R-squared:  0.4637, Adjusted R-squared:  0.4622 
## F-statistic: 319.5 on 9 and 3326 DF,  p-value: < 2.2e-16
## 
## Correlation of Coefficients:
##                      (Intercept) HoursPerWeek TotalHours APM   AssignToHotkeys
## HoursPerWeek         -0.21                                                    
## TotalHours            0.01       -0.20                                        
## APM                  -0.17       -0.15        -0.16                           
## AssignToHotkeys      -0.05       -0.04         0.02      -0.40                
## UniqueHotkeys        -0.16        0.03         0.00      -0.13 -0.26          
## MinimapAttacks        0.08       -0.01        -0.08      -0.08 -0.09          
## TotalMapExplored     -0.59        0.01        -0.02      -0.08 -0.01          
## WorkersMade          -0.29        0.05        -0.01      -0.32  0.01          
## ComplexAbilitiesUsed  0.13       -0.04        -0.01       0.00 -0.09          
##                      UniqueHotkeys MinimapAttacks TotalMapExplored WorkersMade
## HoursPerWeek                                                                  
## TotalHours                                                                    
## APM                                                                           
## AssignToHotkeys                                                               
## UniqueHotkeys                                                                 
## MinimapAttacks       -0.04                                                    
## TotalMapExplored     -0.18         -0.10                                      
## WorkersMade           0.03          0.00          -0.04                       
## ComplexAbilitiesUsed  0.00          0.03          -0.21            -0.05
plot(sc_lm_4)

Model Analysis Reform (Trim and then confound)

Describe your model, how you arrived at it, its goodness of fit, its significance versus other choices of models, and its uncertainty. Describe the predictive power, and the uncertainty. Include relevant tables and figures.

Predict

Evaluate the predictive power of the model – particularly, how effective does the model appear to be at making predictions of future observations or the mean response. How might these predictions be unreliable? What are the limits of the prediction power, and where do we fall into extrapolation?

Prelude to final

7 Describe your proposed research question for the final. How will you revise your original research question? What issues have you encountered so far? What assumptions do you think you need to (re-)evaluate?

For the final, an ordinal logistic regression will remodel the same problem with a different set of techniques and assumptions that fit the ordinal response.

The goal is for the analysis to pull in additional regression techniques while still integrating the previous exploratory exercises.

Appendix

About Columns

Attribute Information:

  1. GameID: Unique ID number for each game (integer)
  2. LeagueIndex: Bronze, Silver, Gold, Platinum, Diamond, Master, GrandMaster, and Professional leagues coded 1-8 (Ordinal)
  3. Age: Age of each player (integer)
  4. HoursPerWeek: Reported hours spent playing per week (integer)
  5. TotalHours: Reported total hours spent playing (integer)
  6. APM: Actions per minute (continuous)
  7. SelectByHotkeys: Number of unit or building selections made using hotkeys per timestamp (continuous)
  8. AssignToHotkeys: Number of units or buildings assigned to hotkeys per timestamp (continuous)
  9. UniqueHotkeys: Number of unique hotkeys used per timestamp (continuous)
  10. MinimapAttacks: Number of attack actions on minimap per timestamp (continuous)
  11. MinimapRightClicks: Number of right-clicks on minimap per timestamp (continuous)
  12. NumberOfPACs: Number of PACs per timestamp (continuous)
  13. GapBetweenPACs: Mean duration in milliseconds between PACs (continuous)
  14. ActionLatency: Mean latency from the onset of a PACs to their first action in milliseconds (continuous)
  15. ActionsInPAC: Mean number of actions within each PAC (continuous)
  16. TotalMapExplored: The number of 24x24 game coordinate grids viewed by the player per timestamp (continuous)
  17. WorkersMade: Number of SCVs, drones, and probes trained per timestamp (continuous)
  18. UniqueUnitsMade: Unique units made per timestamp (continuous)
  19. ComplexUnitsMade: Number of ghosts, infestors, and high templars trained per timestamp (continuous)
  20. ComplexAbilitiesUsed: Abilities requiring specific targeting instructions used per timestamp (continuous)

References

“Congrats to the Starcraft Ii Wcs Global Finals Champion! - Blizzcon.” n.d. Accessed October 26, 2020. https://blizzcon.com/en-us/news/23198508/congrats-to-the-starcraft-ii-wcs-global-finals-champion.

“Leagues: 2019 - Liquipedia - the Starcraft Ii Encyclopedia.” n.d. Accessed October 26, 2020. https://liquipedia.net/starcraft2/Battle.net_Leagues.

“Ordinal Logistic Regression | R Data Analysis Examples.” n.d. Accessed October 27, 2020. https://stats.idre.ucla.edu/r/dae/ordinal-logistic-regression/.

“Perception-Action Cycle - Models, Architectures, and Hardware | Vassilis Cutsuridis | Springer.” n.d. Accessed October 29, 2020. https://www.springer.com/gp/book/9781441914514.

“UCI Machine Learning Repository: SkillCraft1 Master Table Dataset Data Set.” n.d. Accessed October 26, 2020. https://archive.ics.uci.edu/ml/datasets/SkillCraft1+Master+Table+Dataset.

“Winnings: 2019 - Liquipedia - the Starcraft Ii Encyclopedia.” n.d. Accessed October 26, 2020. https://liquipedia.net/starcraft2/Winnings/2019.


  1. Another issue with this variable is that its units do not seem to align with the description. Before and after the unit transformation, GapBetweenPACs results in periods that don’t make sense for the expected duration of SC2 matches.